Combinatorial Bandits

نویسندگان

  • Nicolò Cesa-Bianchi
  • Gábor Lugosi
چکیده

We study sequential prediction problems in which, at each time instance, the forecaster chooses a vector from a given finite set S ⊆ R. At the same time, the opponent chooses a “loss” vector in R and the forecaster suffers a loss that is the inner product of the two vectors. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible element in S. We consider the “bandit” setting in which the forecaster only has access to the losses of the chosen vectors (i.e., the entire loss vectors are not observed). We introduce a variant of a strategy by Dani, Hayes, and Kakade achieving a regret bound that, for a variety of concrete choices of S, is of order √ nd ln |S| where n is the time horizon. This is not improvable in general and is better than previously known bounds. The examples we consider are all such that S ⊆ {0, 1}, and we show how the combinatorial structure of these classes can be exploited to improve the regret bounds. We also point out computationally efficient implementations for various interesting choices of S.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Matroid Bandits: Practical Large-Scale Combinatorial Bandits

A matroid is a notion of independence that is closely related to computational efficiency in combinatorial optimization. In this work, we bring together the ideas of matroids and multiarmed bandits, and propose a new class of stochastic combinatorial bandits, matroid bandits. A key characteristic of this class is that matroid bandits can be solved both computationally and sample efficiently. We...

متن کامل

Semi-Bandits with Knapsacks

We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks and combinatorial semi-bandits. The former concerns limited “resources” consumed by the algorithm, e.g., limited supply in dynamic pricing. The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable. We define a common generalization, supp...

متن کامل

Schemata Bandits for Binary Encoded Combinatorial Optimisation Problems

We introduce the schema bandits algorithm to solve binary combinatorial optimisation problems, like the trap functions and NK landscape, where potential solutions are represented as bit strings. Schema bandits are influenced by two different areas in machine learning, evolutionary computation and multiarmed bandits. The schemata from the schema theorem for genetic algorithms are structured as h...

متن کامل

Matroid Bandits: Fast Combinatorial Optimization with Learning

A matroid is a notion of independence in combinatorial optimization which is closely related to computational efficiency. In particular, it is well known that the maximum of a constrained modular function can be found greedily if and only if the constraints are associated with a matroid. In this paper, we bring together the ideas of bandits and matroids, and propose a new class of combinatorial...

متن کامل

Cascading Bandits

The cascade model is a well-established model of user interaction with content. In this work, we propose cascading bandits, a learning variant of the model where the objective is to learn K most attractive items out of L ground items. We cast the problem as a stochastic combinatorial bandit with a non-linear reward function and partially observed weights of items. Both of these are challenging ...

متن کامل

Combinatorial Cascading Bandits

We propose combinatorial cascading bandits, a class of partial monitoring problems where at each step a learning agent chooses a tuple of ground items subject to constraints and receives a reward if and only if the weights of all chosen items are one. The weights of the items are binary, stochastic, and drawn independently of each other. The agent observes the index of the first chosen item who...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • J. Comput. Syst. Sci.

دوره 78  شماره 

صفحات  -

تاریخ انتشار 2009